Abstract

It is shown that elementary indistinguishability properties of partially polarized mix- 
tures are consistent only with the conventional Hilbert space model of quantum mechanics 
and a few exotic alternatives. This applies even in low dimensions where quantum logic 
and Gleason's theorem give either weak or no constraints. Experimental methods for elim- 
inating the exotic cases (which include quaternionic and octonionic variants of quantum 
mechanics) are described.
I. STATEMENT OF THE PROBLEM.



There is no reason to doubt the validity of the conventional model of quantum
mechanics, wherein pure states x are identified with projectors π(x) = |x⟩⟨x| in a complex
Hilbert space that may be finite or infinite dimensional, and the attenuation of an x-beam
by a y-filter is given by

$$ p(x|y) = \operatorname{tr}\bigl(\pi(x)\,\pi(y)\bigr) = |\langle x|y\rangle|^2. \tag{1} $$

In spite of its success there seems to exist no demonstration of the necessity of the con- 
ventional model, and so from time to time there have been speculations that other models 
might exist (possibly less abstract ones!) that agree with it to the extent that predictions 
have been verified. The value of such speculations lies in the possibility of identifying new 
and subtle effects distinguishing various models. A historic precedent is found in geometry: 
Euclid's axioms codified our physical intuition about the structure of a plane. Cartesian
coordinates were subsequently introduced as a model for the plane, and this worked so
well that many people came to think of the Euclidean plane as no different from the Cartesian
plane. But then it was discovered that certain theorems, e.g. the theorems
of Desargues and of Pappus, are deducible in the Cartesian plane but do not follow from 
Euclid's axioms. When one investigates the reason one learns (among other surprising 
things) that while the Cartesian plane can be embedded in a higher dimensional space, 
not all Euclidean planes can! Thus the physical property of embeddability in a three space 
is actually a fundamental piece of information about the plane as we experience it whose 
significance we would not have appreciated had we not inquired into the necessity of the 
Cartesian model. 

Analogously, a demonstration that a certain elementary set of physical facts compels
us to adopt the conventional model of quantum mechanics rather than any other will 
reveal that certain ingredients of the conventional model (whose significance we may have 
overlooked) are quite essential in distinguishing that model from others. The purpose of 
this paper is to provide such a demonstration. 



We shall place ourselves in the position of an experimentalist who is unprejudiced
as to what model will describe the data. The data will be a table, to be called a p-table,
consisting of measured attenuations 0 ≤ p(x|y) ≤ 1 as x, y range over a set S of filters. Our
task is similar to that of inferring the geometry of the earth from a table of road distances
between cities, and we can make this analogy quantitative by noting that there is a
remarkably simple way of "metrizing" the p-table as follows: Let

$$ d_L(x,y) = \sup_{z \in S} |p(z|x) - p(z|y)|, \qquad d_R(x,y) = \sup_{z \in S} |p(x|z) - p(y|z)|, $$

$$ d(x,y) = \max\{d_L(x,y),\, d_R(x,y)\}. \tag{2} $$

Since we make no a priori assumptions about the underlying physical structure of the
filters, we recognize equivalence x = y of filters from p(x|z) = p(y|z) and p(z|x) = p(z|y)
for all z ∈ S. One then readily checks that d has the three properties required of a metric,
i.e.

$$ x = y \iff d(x,y) = 0, \qquad d(x,y) = d(y,x), $$

$$ d(x,y) + d(y,z) \ge d(x,z), \qquad \forall x, y, z \in S. \tag{3} $$

The only assumption about p used in establishing (3) is that it is real (in fact one
needs only that it belongs to a "valuation ring"). Thus S has now become a metric space
under d, which, because of its generality, we call the universal metric. 
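
The construction of d from a finite p-table can be sketched numerically. The following is only an illustrative sketch: it assumes, purely for the sake of having data, that the table is generated by the conventional rule p(x|y) = |⟨x|y⟩|² on random states in C², and the helper names `random_state`, `attenuation` and `universal_metric` are hypothetical, not taken from the paper. Only `universal_metric` implements definition (2); it makes no use of how the table was produced.

```python
import math
import random

def random_state(rng):
    """Random normalized vector in C^2 (assumed stand-in for a filter)."""
    v = [complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(2)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]

def attenuation(x, y):
    """Assumed table entry: conventional rule p(x|y) = |<x|y>|^2."""
    inner = sum(a.conjugate() * b for a, b in zip(x, y))
    return abs(inner) ** 2

def universal_metric(p):
    """Build d from a square p-table using definition (2)."""
    n = len(p)
    d = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            d_left = max(abs(p[z][x] - p[z][y]) for z in range(n))
            d_right = max(abs(p[x][z] - p[y][z]) for z in range(n))
            d[x][y] = max(d_left, d_right)
    return d

rng = random.Random(0)
filters = [random_state(rng) for _ in range(12)]
p_table = [[attenuation(x, y) for y in filters] for x in filters]
d = universal_metric(p_table)
```

On a finite sample the supremum in (2) becomes a maximum over the sampled filters, so the resulting d is a lower bound on the metric of the full filter set; the metric properties (3) hold for the sampled table in any case.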

The symmetry group Q of S is the group of its permutations that preserve p, i.e.
maps z → z̃ such that p(x̃|ỹ) = p(x|y), ∀x, y ∈ S. Study of the symmetry group Q
is an obvious tool for analyzing the structure of S. In the conventional model there is
a symmetry, which may be called exchange homogeneity, that exchanges any given pair
of filters. While this is simply stated it is not so easy to test experimentally. However,
it has an elementary consequence that is easy to test, namely that p(x|x) = p(y|y) and
p(x|y) = p(y|x), ∀x, y ∈ S. We shall take this as our first assumption about the p-table.
Experimentally one finds the common value of p(x|x) to be unity.

Thus we shall assume: 

$$ p(x|x) = 1, \qquad p(x|y) = p(y|x), \qquad \forall x, y \in S, \tag{4} $$

whence we may rewrite d in the simpler form: 

$$ d(x,y) = \sup_{z \in S} |p(x|z) - p(y|z)|. \tag{5} $$

It is important to note that we do not have to assume the converse of the first property
in (4), i.e. that p(x|y) = 1 implies x = y, for we shall in fact be able to derive it below. In
this connection one may contrast our approach with Mielnik's analysis of quantum
mechanics via the function p, where p(x|y) = 1 defines equivalence of filters x, y rather than
our definition d(x, y) = 0.

From its definition and the restriction 0 ≤ p ≤ 1 we see that 0 ≤ d ≤ 1, and from
(4) that any set x_1, x_2, ... such that p(x_j|x_k) = 0 for j ≠ k are mutually separated by the
maximal amount d = 1. We shall say that x, y are "orthogonal" if p(x|y) = 0 and define
a "basis" as any maximal set T of mutually orthogonal elements. While we introduce
these terms because they correspond to analogous terms in the conventional model, it is 
important to keep in mind that we have as yet no vector space in our construction, and 
hence the reader is warned not to ascribe properties to the terms beyond their definition. 

It is to be noted that in classical physics, where probabilities become certainties, p 
only assumes the values 0,1, so all of S is a basis. More generally a basis T represents a 
maximal subset of S that behaves classically. In the semi-classical case it will be possible 
to partition S into disjoint subsets that are mutually orthogonal. We may imagine that 
this process of "reducing" S to the union of orthogonal components is carried out until it 
can no longer be done, and the final components are then said to be irreducible. 

Let us next give an argument showing that the inherent limitations of experiment are 
such that we can assume without loss of generality that S is a compact metric space, i.e. 
that infinite sequences have limit points: 

In any experiment there will be a parameter ε > 0 indicating the range of error, and
we can regard two models p_1, p_2 as "ε-equivalent" if |p_1(x|y) − p_2(x|y)| < ε, ∀x, y ∈ S.
Moreover we must be able to decide the ε-equivalence of two models with a finite number
of measurements that may of course increase as ε becomes smaller. Now let S(x, ε) be the
"ball" consisting of points y ∈ S such that d(x, y) < ε. Suppose we construct an ε-"cover"
with a set of balls S(x_j, ε), j = 1, 2, ..., M. The triangle inequality implies that, for x in
S(x_j, ε), |d(x, z) − d(x_j, z)| < ε, ∀z ∈ S. Thus the requirement that we be able to determine the accuracy of
the model by a finite number of measurements is implemented by requiring that there be a
finite cover for every ε. The topological term for this property is "total boundedness".
Moreover from an experimental point of view, a filter is indistinguishable from a filter that
is sufficiently close in the d-metric, so we may also assume that every descending sequence
of closed balls with ε → 0 contains only one common point. The topological term for this
is the "condition of Ascola". It can then be shown that the space is "complete", i.e. that
Cauchy sequences converge. Moreover one can show that a metric space is compact if
and only if it is complete and totally bounded. Thus to any desired degree of experimental
accuracy our p-table can be associated with a compact metric space. This means that
there is at most a finite number of elements in any maximally separated set and hence 
that any basis T must contain a finite number of elements. Note that we do not have to 
assume that this number is the same for every basis as we shall be able to deduce this 
below. In the conventional model our conclusion here corresponds to the fact that to
any given experimental accuracy the predictions made using infinite dimensional Hilbert 
spaces — which are not even locally compact — can be approximated by restricting to a 
compact subspace i.e. a Hilbert space of sufficiently large but finite dimension. 
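
The finite ε-cover invoked in this argument can be illustrated with a small sketch. It assumes, only to generate sample data, that filters are labeled by unit vectors in R³ with p = (1 + a·b)/2 (the conventional two-state rule in Bloch-vector form); the function names and the greedy selection rule are hypothetical choices for illustration and are not part of the paper's argument.

```python
import math
import random

def random_direction(rng):
    """Random unit vector in R^3 (assumed label of a filter)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(3)]
    r = math.sqrt(sum(c * c for c in v))
    return [c / r for c in v]

def p(a, b):
    """Assumed attenuation between two labeled filters: (1 + a.b)/2."""
    return 0.5 * (1.0 + sum(x * y for x, y in zip(a, b)))

def dist(i, j, table):
    """Metric (5) with the supremum taken over the sampled filters only."""
    return max(abs(table[i][z] - table[j][z]) for z in range(len(table)))

def greedy_cover(table, eps):
    """Add a filter as a new center only if no existing center is within eps."""
    centers = []
    for i in range(len(table)):
        if all(dist(i, c, table) >= eps for c in centers):
            centers.append(i)
    return centers

rng = random.Random(1)
points = [random_direction(rng) for _ in range(100)]
table = [[p(a, b) for b in points] for a in points]
covers = {eps: greedy_cover(table, eps) for eps in (0.5, 0.25, 0.125)}
```

By construction every sampled filter lies in the ball S(x_j, ε) of at least one chosen center x_j, so each `covers[eps]` is a finite ε-cover of the sample.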

As we noted earlier we expect that the mathematical structure of S will be determined 
by its symmetry group Q. In addition to the exchange homogeneity, of which (4) is a 
consequence, the conventional model enjoys two others worth noting at this point, for they 
will play an important role in our discussion below: 

Consider the subgroup Q(T) of Q that fixes every element of a basis T. In the conventional
model these are unitary transformations obtained by exponentiating hermitian
operators for which the states projected by the elements of T are an eigen-basis. Since
such operators commute one sees that Q(T) is a commutative group. This property of the
conventional model is the one used to argue that integrals of the motion commute with the 
hamiltonian. We shall see at the end of the paper that it is precisely this property that 
distinguishes the conventional model from certain exotic possibilities that are consistent 
with the other assumed properties of the p-table. 

Next we note that the conventional model enjoys a symmetry property known as
pairwise homogeneity, i.e. given two pairs of elements a, b and x, y such that p(a|b) =
p(x|y), there exists a map in Q that takes a to x and b to y. This property has a
remarkable consequence: In general (5) does not determine p(x|y) from d(x, y). However,
if pairwise homogeneity holds, one sees that if p(a|b) = p(x|y) one can use the p-preserving
mapping a → x and b → y to argue that the right sides of (5) will be the same, so that
d(a, b) = d(x, y). Thus there will exist some functional relationship:

$$ d(x,y) = f\bigl(p(x|y)\bigr). \tag{6} $$

Since f is also monotone in the conventional model (see Appendix), its inverse exists and
the isometry group with respect to d can be identified with the symmetry group Q. 

While it is tempting to assume the pairwise homogeneity property we shall not do
so for the same reason that we avoided assuming exchange homogeneity above, namely
that it is not a property that lends itself to easy experimental test. Rather we shall focus
on the relationship (6), seeking an easily verified phenomenon that determines the form 
of (6). Remarkably we will find that this phenomenon determines all that we could have 
extracted from an assumption of pairwise homogeneity. 

III. INDISTINGUISHABILITY PROPERTIES OF MIXTURES. 

Since interference is the hallmark of quantum mechanics, one suspects that the sought- 
after phenomena must be of this kind. However, we have the problem that interference is 
normally expressed in terms of phase relations in superpositions of states, and the notion 
of "phase" has no meaning prior to the formulation of a complex vector space model. Thus 
we must first recognize those interference phenomena that can be expressed in a
model-independent way, i.e. directly in terms of entries in the p-table. Such relations can be
obtained from the study of mixtures. A mixture M consisting of a fraction α_j of particles
in the state x_j, 0 ≤ α_j ≤ 1, j = 1, 2, ..., N, will be denoted:

$$ M = \sum_j \alpha_j x_j, \qquad \sum_j \alpha_j = 1. \tag{7} $$

It must be understood clearly that prior to the construction of a model this is merely a
formal shorthand since the notion of linear combination of filters is not defined. However,
the fraction of M that passes a z-filter is well-defined, i.e.

$$ P(M|z) = \sum_j \alpha_j\, p(x_j|z), \tag{8} $$

so that equivalence of mixtures can be defined by:

$$ M = M' \iff P(M|z) = P(M'|z) \quad \forall z \in S. \tag{9} $$

The kind of interference phenomena we are seeking will be expressed as statements of
equivalence between mixtures constructed in different ways. Unlike interference of amplitudes
they are formulated directly from data in the p-table without benefit of a vector space 
model. In the Hilbert space formalism interferences of mixtures occur because off-diagonal 
elements of density matrices are in general complex numbers and so may cancel upon 
addition. 

Using any basis {x}_N = x_1, x_2, ..., x_N we can construct an unpolarized mixture:

$$ U(\{x\}_N) = N^{-1} \sum_{j=1}^{N} x_j, \tag{10} $$

containing equal fractions of each of the basis states. Then the first of the experimentally
verifiable interferences of probability that we are going to assume is quite familiar: All 
unpolarized mixtures are equivalent. Thus, e.g. one cannot distinguish an equal mixture 
of left and right circularly polarized light from an equal mixture of orthogonal linearly 
polarized states. Thus we assume: 

$$ U(\{x\}_N) = U(\{y\}_{N'}) \quad \text{for any bases } \{x\}_N,\ \{y\}_{N'}, $$

or equivalently

$$ \frac{1}{N} \sum_{j=1}^{N} p(x_j|z) = \frac{1}{N'} \sum_{k=1}^{N'} p(y_k|z) \qquad \forall z \in S. \tag{11} $$

We will see below that (4) implies N = N'. In the conventional model (11) follows from 
the fact that the density matrix for unpolarized mixtures is a multiple of the unit matrix 
and hence the same in any basis. But (11) is also consistent with various hidden variable 
models, and thus will not suffice to distinguish the conventional model. However we observe 
that there is another elementary indistinguishability property that can be observed using 
partially polarized mixtures. There are two simple ways to make such mixtures: by taking 
an equal mixture of non-orthogonal filters or an unequal mixture of orthogonal filters. It 
then turns out that for any mixture of the one type there is an equivalent mixture of 
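the other. More precisely: Given any a, b there is an orthogonal pair c, c′ and a number
0 ≤ λ ≤ 1 such that:

$$ \tfrac12 a + \tfrac12 b = \lambda c + (1-\lambda) c', $$

or equivalently

$$ \tfrac12 p(a|z) + \tfrac12 p(b|z) = \lambda\, p(c|z) + (1-\lambda)\, p(c'|z), \qquad \forall z \in S. \tag{12} $$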

In the conventional model one deduces (12) by diagonalizing the density matrix associated 
with the left side obtaining a result with non-vanishing elements in a two-dimensional 
subspace associated with the right side. From our point of view it is a simply testable 
property described in a model independent way that, as we shall see, contains almost all 
that we need to deduce the conventional model. 
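
Both indistinguishability properties can be checked numerically in the conventional two-state model. The sketch below is an illustration under stated assumptions, not the paper's method: filters are labeled by Bloch vectors (unit vectors in R³) with p = (1 + a·b)/2, orthogonal filters are antipodal vectors, and the names `mixture_pass` and `orthogonal_decomposition` are hypothetical. In this representation the pair c, c′ and the weight λ of (12) follow from the direction and length of the averaged vector (a + b)/2.

```python
import math

def p(a, b):
    """Assumed attenuation between Bloch vectors: (1 + a.b)/2."""
    return 0.5 * (1.0 + sum(x * y for x, y in zip(a, b)))

def mixture_pass(weights, states, z):
    """P(M|z) = sum_j alpha_j p(x_j|z), equation (8)."""
    return sum(w * p(x, z) for w, x in zip(weights, states))

def orthogonal_decomposition(a, b):
    """Return (lam, c, c_prime) realizing (12) for the equal mixture of a, b."""
    r = [0.5 * (x + y) for x, y in zip(a, b)]
    length = math.sqrt(sum(x * x for x in r))
    if length < 1e-15:
        # a and b are antipodal: the mixture is already unpolarized.
        return 0.5, list(a), list(b)
    c = [x / length for x in r]
    c_prime = [-x for x in c]
    return 0.5 * (1.0 + length), c, c_prime

# Two non-orthogonal filters and a few probe filters z.
a = [0.0, 0.0, 1.0]
b = [math.sin(1.0), 0.0, math.cos(1.0)]
lam, c, c_prime = orthogonal_decomposition(a, b)
probes = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.0, 0.8], [0.0, -0.6, 0.8]]

# Two different bases (antipodal pairs), e.g. linear and circular polarization.
linear = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
circular = [[0.0, 0.0, 1.0], [0.0, 0.0, -1.0]]
```

For each probe z, `mixture_pass([0.5, 0.5], [a, b], z)` agrees with `mixture_pass([lam, 1 - lam], [c, c_prime], z)`, which is (12), and the equal mixtures over `linear` and `circular` both pass the fraction 1/2, which is (11) for N = 2.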

IV. STATEMENT OF THE MAIN THEOREM.

We now summarize the results of our discussion above and formulate the main theorem 
to be proved below:

We assume that a function 0 ≤ p(x|y) ≤ 1 called an attenuation function is given on
pairs of elements in a set called the space of filters. This becomes a metric space under the
universal metric d determined by p (see (2)). Any compact subspace (with respect to d)
can be decomposed into the finite union of subspaces that are mutually orthogonal in the
sense that p(x|y) = 0 when x and y belong to different subspaces and are irreducible in
the sense that they cannot be further decomposed in this way. We refer to these compact,
irreducible subsets as components. A typical component is denoted S. A maximal set T of
mutually orthogonal elements of S is called a basis of S, and it follows from compactness
that the number of elements in any such set is finite so that it makes sense to define 
unpolarized mixtures by (10). 

In the following R, C, Q and Cay refer to the real numbers, complex numbers,
quaternions, and octonions (Cayley numbers) respectively. 

The main theorem tells us the structure of S subject only to the following assumptions 
about the attenuation function: 

(i) p(x|x) = 1, p(x|y) = p(y|x), ∀x, y ∈ S.

(ii) Unpolarized mixtures are indistinguishable (11). 

(iii) The averaging property (12) of partially polarized mixtures holds. 

§(4.1) Main Theorem 

§(A) All bases have the same number of elements N called the "dimension" of S. 
§(B) The attenuation function p is related to the universal metric by the formula:

$$ d(x, y) = \{1 - p(x|y)\}^{1/2}. $$

In particular this means that isometries with respect to d are symmetries of S. 
§(C) For N = 2 there is an isometry of S to a unit sphere of finite dimension m on which
the metric is the great circle arc-length. The two elements of a basis lie at antipodes, and
p(x|y) = cos²(θ(x, y)/2), where θ is the great circle arc-length between the corresponding
points. In the cases m = 1, 2, 4, 8 the sphere is also a projective space over R, C, Q or Cay,
respectively, and the trace rule (1) for p holds.

§(D) For N = 3 the set S can be mapped to a projective space over R, C, Q or Cay, in
such a way that the trace rule (1) for p holds.

§(E) For N > 3 the result is as for N = 3 except that Cay is excluded.
§(F) The conventional model of quantum mechanics (which corresponds to R or C projective
spaces above) is the unique model for which the subgroup of isometries fixing the elements
of a basis is commutative.

In the course of the proof we shall indicate how our main theorem relates to the
quantum logic of Birkhoff and von Neumann, to Gleason's theorem, and to
the Jordan algebra axiomatization scheme [10]. The physical significance of §(F) of the
main theorem will be discussed at the end of the paper.

The proof of §(A) is quite simple: Insert z = y_l and sum both sides of (11) over l;
insert z = x_n, sum both sides over n, interchange, and use (4) to deduce that N = N′.
We also have the important corollary: 

§(4.2) p is a "frame function", i.e.:

$$ \sum_{j=1}^{N} p(x_j|z) = 1, \qquad \forall z, \text{ for any basis } \{x\}_N. \tag{13} $$

To see this observe that any z can be made part of a basis by adjoining orthogonal elements 
to it until a maximal set is obtained. The assertion then follows from (11). Note that (13) 
is the familiar assertion in quantum mechanics that the sum of the probabilities is unity 
for a particle to pass the filters belonging to a basis. 

Proofs of the remaining parts of the main theorem are more complicated and appear 
in various sections below. 

V. DEDUCTION OF THE RELATION (B) OF p TO THE METRIC. 

Two lemmas will be needed: Let n < N where N is the dimension of the system and
let {v}_n = v_1, v_2, ..., v_n be a mutually orthogonal set. Then we define the subspace S* of
S spanned by {v}_n as the set of states u such that:

$$ \sum_{j=1}^{n} p(v_j|u) = 1. \tag{14} $$

The use of the term "subspace" is justified by the following lemma: 

§(5.1) If {w}_n is any other mutually orthogonal set of n elements in S*, then it also
spans S*. Moreover

$$ \sum_{j=1}^{n} p(w_j|z) = \sum_{j=1}^{n} p(v_j|z) \qquad \forall z \in S, \tag{15} $$

the common value being unity if z ∈ S*.

Proof: Let {v}_n be extended to a basis of S by adjoining orthogonal elements
v_{n+1}, ..., v_N. Then from the definition and (13), p(v_j|u) = 0 for u ∈ S* and j > n. Hence
p(w_k|v_j) = 0 for j > n and hence w_1, ..., w_n, v_{n+1}, ..., v_N is a basis of S. Comparing (13)
for these two bases the assertion follows. ∎

Note that §(5.1) shows that it makes sense to speak of the dimension of a subspace as 
the (unique) number of elements in any maximal mutually orthogonal set in the subspace. 
Another useful corollary is the following: For given x consider the one dimensional subspace 
consisting of the set of elements z such that p(x\z) = 1. Since it is one-dimensional any 
element y for which p(x\y) = 1 also spans it, and hence, by §(5.1), p(x\z) = p(y\z) for all 
z ∈ S, i.e. x = y. Combining this with (4) we have proved:

$$ x = y \iff p(x|y) = 1. \tag{16} $$

We thus deduce the equivalence criterion assumed by Mielnik noted above. With (5)
we also obtain the converse of a relationship noted above between bases and maximally 
separated elements: 

§(5.2): Every maximally separated set of elements (in the sense of the d-metric) is a 
basis and vice versa. 

We have not yet used (12) but will now do so in proving: 

§(5.3) If a ≠ b there is a unique two-dimensional subspace V_ab containing a, b as well
as c, c′ appearing in (12).

Proof: Let c_1 = c, c_2 = c′ and extend to a basis of S by adjoining N − 2 orthogonal
elements c_3, ..., c_N. With z = c_i, i > 2 the right side of (12) is zero and since the quantities
on the left are non-negative, we must have p(a|c_i) = p(b|c_i) = 0 for i > 2. Hence, by (13, 14),
a, b are in the two-dimensional subspace spanned by c, c′. Now if a, b are distinct, λ ≠ 1. For
otherwise p(a|c′) = p(b|c′) = 0, and so a, b lie in the one-dimensional subspace spanned by
c, i.e. a = b. By similar argument λ ≠ 0. Now suppose a, b belong to a subspace spanned
by some other orthogonal pair d, d′. Then inserting z = d and z = d′ into (12) and adding
one has: 1 = λ(p(c|d) + p(c|d′)) + (1 − λ)(p(c′|d) + p(c′|d′)). Since 0 < λ < 1 this can be
satisfied only if p(c|d) + p(c|d′) = p(c′|d) + p(c′|d′) = 1, which means that d, d′ and c, c′
define the same subspace. ∎

From §(5.3) we can make the following definition: If b ≠ a then the antipode a′ of a
relative to b is the unique element in V_ab such that p(a|a′) = 0. Moreover since a, a′ and
b, b′ span V_ab as well as c, c′ in (12) we have proved that if a ≠ b then:

$$ p(a|z) + p(a'|z) = p(b|z) + p(b'|z) = p(c|z) + p(c'|z), \qquad \forall z \in S, \tag{17} $$

the common value being unity if z ∈ V_ab.

For later use we take note of three corollaries: 

§(5.4): V_ab is determined by any pair of its distinct elements. Thus if V_1 and V_2
are two-dimensional subspaces, they are either disjoint, identical or have just one common 
point. 

This will be exploited later on in demonstrating that our space has among other things 
the structure of a projective geometry. Note: We shall say that a pair of two-dimensional 
subspaces are adjacent if they have exactly one common point. 

§(5.5): If z is orthogonal to two distinct points of a two dimensional subspace it is 
orthogonal to every point of that subspace. 

We say that a pair of adjacent two-dimensional subspaces are normal to one another 
if the antipodes of the intersection in the two subspaces are orthogonal to one another. 

§(5.6): If z is the intersection of a pair of normal two-dimensional subspaces then its 
antipode z' on one of the two subspaces is orthogonal to every point of the other. 

We are now ready to prove (B) of the Main Theorem §(4.1). 

If a = b, p(a|b) = 1 and there is nothing to prove. If a ≠ b then let b′ be the antipode
of b relative to a. If b′ = a, so that p(a|b) = 0, we already know that d(a, b) = 1, and (B)
follows. Thus we may assume that a ≠ b, b′. Equations (12, 17) can be combined to give:

$$ p(a|z) - p(b'|z) = (2\lambda - 1)\bigl(p(c|z) - p(c'|z)\bigr) \qquad \forall z \in S. \tag{18} $$

Taking the supremum over z, the definition (5) of d gives:

$$ d(a, b') = |2\lambda - 1|\, d(c, c') = |2\lambda - 1|, \tag{19} $$

whence, since we assume a ≠ b′, it follows that λ ≠ 1/2. From (17) when z ∈ V_ab so that
the common value is unity one obtains:

$$ p(a|z) - p(b'|z) = (2\lambda - 1)\bigl(2p(c|z) - 1\bigr) = p(b|z) - p(a'|z). \tag{20} $$

In particular for z = a and z = b:

$$ p(a|b) = (2\lambda - 1)\bigl(2p(c|a) - 1\bigr) = (2\lambda - 1)\bigl(2p(c|b) - 1\bigr), \tag{21} $$

so that since λ ≠ 1/2:

$$ p(c|a) = p(c|b). \tag{22} $$

Then from (12) with z = c

$$ \lambda = p(c|a), \tag{23} $$

so that from (21)

$$ |2\lambda - 1| = \sqrt{p(a|b)} = \sqrt{1 - p(a|b')}, \tag{24} $$

which with (19) and the substitution x = a, y = b′ gives

$$ d(x, y) = \{1 - p(x|y)\}^{1/2}, \tag{25} $$

which is (B) of §(4.1). ∎

Since d uniquely determines p from p = 1 − d², it also follows that every isometry with
respect to d is also a symmetry of S as defined above. 
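
Relation (25) can be checked numerically in the conventional two-state model. The sketch below is illustrative only: it assumes filters are labeled by Bloch vectors with p = (1 + a·b)/2, and it approximates the supremum in (5) by a maximum over random probe filters plus the probe along a − b, which in this assumed representation is the direction that maximizes |p(a|z) − p(b|z)|. All function names are hypothetical.

```python
import math
import random

def unit(v):
    """Normalize a nonzero vector in R^3."""
    r = math.sqrt(sum(c * c for c in v))
    return [c / r for c in v]

def p(a, b):
    """Assumed attenuation between Bloch vectors: (1 + a.b)/2."""
    return 0.5 * (1.0 + sum(x * y for x, y in zip(a, b)))

def sup_distance(a, b, probes):
    """Definition (5), with the supremum replaced by a max over probes."""
    return max(abs(p(a, z) - p(b, z)) for z in probes)

rng = random.Random(2)
a = unit([rng.gauss(0.0, 1.0) for _ in range(3)])
b = unit([rng.gauss(0.0, 1.0) for _ in range(3)])

# Random probes, plus the probe along a - b that attains the supremum here.
best = unit([x - y for x, y in zip(a, b)])
probes = [unit([rng.gauss(0.0, 1.0) for _ in range(3)]) for _ in range(2000)]
probes.append(best)

d_ab = sup_distance(a, b, probes)
predicted = math.sqrt(1.0 - p(a, b))
```

Under these assumptions `d_ab` and `predicted` agree to rounding error, since |p(a|z) − p(b|z)| = |(a − b)·z|/2 is bounded by |a − b|/2 = {1 − p(a|b)}^{1/2}.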

The proof that (25) agrees with the conventional model is given in the Appendix, where
the proof is also reproduced that this relationship is distinct from the one obtained
in locally realistic (hidden variable) theories. Thus it follows that although (11) can be
reproduced by locally realistic theories, the relation (12) cannot!

VI. CONVEXITY.

We next use (12) and the fact that S is irreducible to establish a basic convexity 
property of S that will be needed in the remaining parts of the proof of the main theorem. 

Let us examine the uniqueness of λ, c, c′ in (12) for given a, b. Assume that p(a|b) ≠
0, 1. One notes that (24) has exactly two solutions for λ, one in the interval 1 > λ > 1/2
and the other its image under λ → 1 − λ. The exchange of these two is equivalent to
exchanging c and c′ in (12). If we fix λ then the c satisfying (12) is unique for the
following reason: Since §(5.1) implies c ∈ V_ab, §(5.3) implies that for any z ∈ V_ab we must
have p(c′|z) = 1 − p(c|z) and hence from (12):

$$ \tfrac12 p(a|z) + \tfrac12 p(b|z) = (2\lambda - 1)\, p(c|z) + (1 - \lambda) \qquad \forall z \in V_{ab}. \tag{26} $$

Hence if λ > 1/2 is given it follows that if c_1 and c_2 are two different choices for c ∈ V_ab,
then p(c_1|z) = p(c_2|z) for all z ∈ V_ab. In particular with z = c_1 we obtain p(c_1|c_2) = 1 so
that c_1 = c_2 by (16). This establishes the uniqueness of c for given λ > 1/2, and we can
give it the following geometric interpretation. Define the arc θ(a, b) through the equation:

$$ p(a|b) = \cos^2\bigl(\theta(a, b)/2\bigr), \qquad 0 \le \theta \le \pi, \tag{27} $$

so that

$$ \lambda = \cos^2\bigl(\theta(a, b)/4\bigr) \tag{28} $$

is the solution of (24) with λ > 1/2. Then from (22, 23) we have

$$ \theta(a, c) = \theta(c, b) = \theta(a, b)/2. \tag{29} $$

We have thus established:

§(6.1) If p(a|b) ≠ 0, 1 there is a unique point c in V_ab that may be called the midpoint
of a and b satisfying θ(a, c) = θ(c, b) = θ(a, b)/2. In other words S is an "M-convex"
metric space [12].

In order that §(6.1) be interesting it is necessary that there exist a, b for which
p(a|b) ≠ 0, 1. But this follows from the irreducibility of S. We will see below that it
also implies that S is connected. 
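
The midpoint relations (24) and (27)-(29) can be illustrated in the conventional two-state model. This is a hedged sketch: it assumes Bloch-vector labels with p = (1 + a·b)/2, for which the arc θ of (27) is the ordinary angle between the vectors, and it takes the midpoint to be the normalized sum of the two vectors. The names `arc` and `midpoint` and the sample angle 1.2 are hypothetical.

```python
import math

def p(a, b):
    """Assumed attenuation between Bloch vectors: (1 + a.b)/2."""
    return 0.5 * (1.0 + sum(x * y for x, y in zip(a, b)))

def arc(a, b):
    """theta(a, b) defined through p(a|b) = cos^2(theta/2), equation (27)."""
    value = min(1.0, max(0.0, p(a, b)))  # clamp against rounding error
    return 2.0 * math.acos(math.sqrt(value))

def midpoint(a, b):
    """Great-circle midpoint of two non-antipodal unit vectors."""
    s = [x + y for x, y in zip(a, b)]
    r = math.sqrt(sum(x * x for x in s))
    return [x / r for x in s]

theta = 1.2  # sample arc between a and b, with 0 < theta < pi
a = [0.0, 0.0, 1.0]
b = [math.sin(theta), 0.0, math.cos(theta)]
c = midpoint(a, b)
lam = math.cos(theta / 4.0) ** 2  # equation (28)
```

With these choices p(c|a) = p(c|b) = `lam`, as in (22) and (23); |2·`lam` − 1| equals the square root of p(a|b), as in (24); and `arc(a, c)` and `arc(c, b)` both equal `theta / 2`, as in (29).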



VII. PROOF OF §(4.1C) AND THE STRUCTURE OF TWO DIMENSIONAL SUBSPACES. 



First we must caution the reader about the use of the term "dimension" in the following 
discussion. Recall that the dimension of a subspace S* of S is the number of mutually 
orthogonal elements (as defined by p) or equivalently the maximal number of points with 
maximal separation. One should not confuse this with the dimension m of a space as 
a manifold. Thus e.g. an m-sphere has dimension m as a manifold but has exactly two 
maximally separated elements for any m ≥ 1. In particular linearly polarized photon states
are represented by points on a circle (1-sphere) whereas the set of all polarization states 
corresponds to the points of the Poincare sphere (a 2-sphere). In both cases there are 
exactly two orthogonal states in any basis. Since linearly polarized states are described by 
a real two-dimensional Hilbert space, whereas the set of all polarization states requires a 
complex two-dimensional Hilbert space, one sees that the increase in manifold dimension 
is associated with the enlargement of the coefficient field required, not in the number of 
orthogonal states. 

We shall say that a subset V* of V is properly mapped to a unit m-sphere if for every
pair of points a, b in V*, the great-circle arc θ(a, b) joining the corresponding points satisfies
p(a|b) = cos²(θ(a, b)/2). If some V* is properly mapped to a unit m-sphere, then one can
adjoin the antipodes of all elements, and the extended set is still properly mapped to the
m-sphere. This follows simply from the fact that the antipode a′ of a will, in virtue of
p(a|z) + p(a′|z) = 1, ∀z ∈ V, give the correct arc length to θ(a′, z) whenever θ(a, z) is
correct. We now show that V* can be enlarged so that together with any pair of its points it
also contains the midpoint: First rewrite (26) in terms of arcs with α, β, γ, θ being the arcs
of p(a|z), p(b|z), p(c|z), and p(a|b) respectively (see Figure 1). After some trigonometric
manipulation it becomes:

$$ \tfrac12 (\cos\alpha + \cos\beta) = \cos\gamma\, \cos(\theta/2). \tag{30} $$

This equation has a remarkable significance: Suppose that it is possible to put z, a, b on
a two-sphere in such a way that the great-circle arc lengths between them agree with the
α, β, θ defined above. In spherical trigonometry it is shown [13] that (30) is the formula
giving the length γ of the great-circle arc joining z to the mid-point of the arc connecting
a, b. Thus it would follow that c of (12) is properly mapped to that same 2-sphere! But
if three points can be properly mapped to an m-sphere they also belong to a 2-sphere
(possibly degenerating to a 0- or 1-sphere) within it. Thus starting with any subset V* we
can continue to adjoin midpoints of all pairs, antipodes of all elements, and finally take 
the closure in the metric topology to form a set [V*] called the m-closure that is closed 
with respect to all three operations. Thus we have: 

§(7.1): If V* can be properly mapped to an m-sphere then so also can its m-closure

[V*]. 
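
The spherical-trigonometry reading of (30) used in this argument can be checked directly on an ordinary unit 2-sphere. The sketch below is an illustration only: a, b, z are random unit vectors in R³, c is taken as the great-circle midpoint of a and b, and all angles are ordinary great-circle arcs. The helper names are hypothetical.

```python
import math
import random

def unit(v):
    """Normalize a nonzero vector in R^3."""
    r = math.sqrt(sum(x * x for x in v))
    return [x / r for x in v]

def angle(u, v):
    """Great-circle arc between two unit vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return math.acos(min(1.0, max(-1.0, dot)))  # clamp against rounding

rng = random.Random(3)
a = unit([rng.gauss(0.0, 1.0) for _ in range(3)])
b = unit([rng.gauss(0.0, 1.0) for _ in range(3)])
z = unit([rng.gauss(0.0, 1.0) for _ in range(3)])
c = unit([x + y for x, y in zip(a, b)])  # midpoint of the arc from a to b

alpha, beta = angle(a, z), angle(b, z)
gamma, theta = angle(c, z), angle(a, b)

lhs = 0.5 * (math.cos(alpha) + math.cos(beta))
rhs = math.cos(gamma) * math.cos(theta / 2.0)
```

Here `lhs` and `rhs` agree to rounding error, which is the statement that (30) gives the arc γ from z to the midpoint of the arc joining a and b.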

From the assumption of irreducibility it follows that given any element x in V there
must exist in addition to its antipode x′ at least one element y such that 0 < p(x|y) < 1.
But clearly the set consisting of the three points x, y, x′ can be mapped to a unit circle
with x, x′ at opposite points of a diameter. By §(7.1) so also can its m-closure. But one
sees from the construction of the m-closure that this is simply the unit circle itself. Thus
we have proved:

§(7.2) V is connected. In fact every pair of points a, b ∈ V can be connected by a
circular arc θ(a, b) in such a way that d(a, b) = sin(θ(a, b)/2) and so is just one-half the
length of the chord of the circular arc connecting the two points.

We next introduce a useful tool for analyzing the geometry: For any x in V we define
the equator E(x) opposite x as the set of points c with the property θ(x, c) = θ(c, x′) =
π/2. From connectedness this set is not empty. Clearly p(x|c) = 1/2 ⟹ p(c′|x) = 1/2, so
c ∈ E(x) ⟹ c′ ∈ E(x). Also if a, b ∈ E(x) are not one another's antipodes, so that λ ≠ 1/2
in (26), then solving (26) with p(a|x) = p(b|x) = 1/2 gives p(c|x) = 1/2. Thus

§(7.3) The midpoint of two points in E(x) is also in E(x).

Thus E(x) is a proper subset of V (since x, x′ ∉ E(x)) that is closed with respect to
inclusion of antipodes and midpoints just as V itself. If E(x) is non-empty we can select
an arbitrary point, say x_1, and define the equator E_1(x_1) opposite x_1 as the subset of E(x)
consisting of points y satisfying p(y|x_1) = 1/2. We continue in this way to generate a
sequence x_1, x_2, .... At some finite m we will encounter an x_m for which the equator
opposite is empty. The reason m must be finite is that every point in the sequence is
separated from the others by d = 1/√2 in the metric, and hence, by compactness, such
a sequence can have at most a finite number of elements. Quantum mechanics over the
reals, complex numbers, quaternions, and octonions (Cayley numbers) would correspond
to m = 1, 2, 4, 8 respectively. We call the sequence x_1, ..., x_m an equatorial decomposition.

Now we claim that V is the m-closure of the set V* consisting of any point x and the
equator E(x). To see this note that if a ∉ V* then there is a unique circle containing x, a, x′
that intersects E(x) in a unique point y, and a is in the m-closure of any set containing
y and x. Now if x and E(x) can be properly mapped to a j-sphere, then clearly V* can
be properly mapped to a (j + 1)-sphere. By §(7.1) so can its m-closure, and hence so can
V. Hence, using the equatorial decomposition, we obtain part (C) of the main theorem
§(4.1) by induction. ∎

We now know that a two-dimensional subspace (N = 2) is an m-sphere (m ≥ 1). We
will call it a generalized Poincaré sphere (GPS). To avoid confusion we refer to m as the
rank of the GPS rather than a dimension. It is the number of points in the equatorial
decomposition. The term Poincaré sphere without the adjective "generalized" is used to
specify the conventional model (m = 2). Thus in the case of polarized light we may
represent the pole by the projector of the state (1, 0) and the two points of an equatorial
decomposition can be taken to be the projectors of the states (1, 1)/√2 and (1, i)/√2.
Thus a maximal set with mutual attenuation equal to 1/2 has three elements. If the
description of polarization required quaternions (N = 2, m = 4) we could take an equatorial
decomposition (1, 1)/√2, (1, i)/√2, (1, j)/√2, (1, k)/√2 where i, j, k are the quaternionic
units, and we would find that there are five "polarization" filters with mutual attenuation
equal to 1/2 rather than three. In the octonionic case (m = 8) we would replace k by 
the seven octonionic units and have 8 filters with mutual attenuation 1/2. (For arbitrary 
m the units are associated with all possible Jordan algebras [|10| - see below). 
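
The conventional case ($m = 2$) of this count can be checked numerically. The following is a minimal sketch, assuming the conventional model in which the attenuation is $|\langle u|v\rangle|^2$; the helper name `transition_probability` and the labels `pole`, `e1`, `e2` are illustrative and not taken from the text.

```python
import itertools
import numpy as np

def transition_probability(u, v):
    """Return |<u|v>|^2 for two normalized complex vectors."""
    return abs(np.vdot(u, v)) ** 2

# Pole and the two points of an equatorial decomposition (rank m = 2).
pole = np.array([1, 0], dtype=complex)
e1 = np.array([1, 1], dtype=complex) / np.sqrt(2)
e2 = np.array([1, 1j], dtype=complex) / np.sqrt(2)

states = {"pole": pole, "e1": e1, "e2": e2}

# Mutual attenuation for every unordered pair of the three filters.
mutual = {
    (a, b): transition_probability(states[a], states[b])
    for a, b in itertools.combinations(states, 2)
}
for pair, value in mutual.items():
    print(pair, round(value, 12))
```

All three pairs give attenuation 1/2, which is the three-element maximal set described above.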

Let us note here that while $N = 2$ is conceptually simpler than $N > 2$, it is the most
troublesome case for various axiomatic schemes. Thus, e.g., in the so-called quantum-logic
approach of Birkhoff and von Neumann, no information is gotten about $N = 2$
because the fundamental theorem of projective geometry that they exploit gives non-trivial
constraints only for $N > 2$. Moreover, even if one assumes that one has a Hilbert space for
$N = 2$, it is not possible to deduce (1) from (11) alone using Gleason's theorem (see
discussion below). It is thus encouraging that (12) has supplied us with new structure not
present in quantum logic. We will see next that (12) also gives us more for the difficult
case $N = 3$ than one obtains from quantum logic.

VIII. DIMENSION N = 3. 

We have already shown that the geodesic for $N = 2$ is a great circle arc. We want
to show next that in higher dimensions the great circle arc on the generalized Poincaré
sphere joining two points is still the shortest of all possible rectifiable curves joining them.

§(8.1) The geodesic connecting any two filters is an arc of a great circle lying in the
generalized Poincaré sphere containing them.

Proof: $d(a, b)$ is one-half the chord length of the circular arc between $a, b$ in $\mathcal{P}_{ab}$. The
great circle arc joining $a, b$ has length $\theta(a, b)/2$ in this metric. Hence if $\Gamma$ is a rectifiable
curve of shortest length $\phi(a, b)$ one has:

$$\sin(\theta(a, b)/2) \leq \phi(a, b) \leq \theta(a, b)/2. \tag{31}$$

Mark off points $a = a_0, a_1, a_2, \ldots, a_n = b$ on the curve so that the segments connecting
adjacent points have equal length $\phi/n$. Then by (31) this differs in length from the circular
arc joining a pair of adjacent points by an amount of order $(\phi/n)^2$, and hence for the whole
curve the error is of order $\phi^2/n \to 0$ as $n \to \infty$. Hence we can approximate $\Gamma$ to arbitrary
accuracy by a sufficiently large number of circular arcs joining adjacent points. We must
establish that the shortest length is obtained when those arcs lie on the same circle. To
do this we proceed as follows:

Recall that (26) required $z \in \mathcal{P}_{ab}$ because it was derived from (12) using
$p(c|z) + p(c'|z) = 1$. However, for arbitrary $z$ one sees that $p(c|z) + p(c'|z) \leq 1$. Hence (26)
generalizes as an inequality for arbitrary $z$ and, when expressed in terms of arcs, (30)
becomes in the special case $\alpha = \beta$:

$$\cos\alpha \leq \cos\gamma\,\cos(\theta/2) \leq \cos(\theta/2), \quad \text{i.e.} \quad \alpha \geq \theta/2, \tag{32}$$

in which the arcs are no longer required to lie on the same sphere. But this simply says
that the sum of two equal circular arcs connecting $a$ to $z$ and $z$ to $b$ is not smaller than
the length of a single great circle arc connecting $a$ to $b$, which is what we had to prove. •

§(8.2) All generalized Poincaré spheres in $\mathcal{S}$ are of the same rank. The rank $m$ of
these spheres is called the rank of $\mathcal{S}$.

Proof: From §(5.4) two GPS's are either disjoint, have one point in common, or are
identical. If they are disjoint there is a GPS containing one point from each. Hence it
suffices to prove the theorem for a pair of adjacent GPS's. Connect the antipodes of the
intersection point $x$ by a great circle arc and divide this up into a large number of equal
segments. Then we have a chain of closely spaced points generically denoted $y$. Clearly
each such $y$ satisfies $p(y|x) = 0$ and hence is also the antipode of $x$ on some GPS denoted
$\mathcal{P}_{xy}$. Thus we need only show that for GPS's with closely spaced $y$ the dimension will be
the same, i.e. that the number of points in the equatorial decomposition cannot suddenly
jump for $\mathcal{P}_{xy}$'s with nearby $y$'s. But since $|d(a, x_1) - d(a, x_2)| \to 0$ for $d(x_1, x_2) \to 0$ by the
triangle inequality, it follows that if $d(a, x_1) = 1/\sqrt{2}$ then $d(a, x_2)$ can be made arbitrarily
close to $1/\sqrt{2}$ for $x_1$ and $x_2$ close enough. Hence the sequence of $\mathcal{P}_{xy}$'s has an equatorial
decomposition retaining the same number of points in the limit $y \to x_1$. •

If $z$ is a point not on the GPS $\mathcal{P}$ we shall define a foot of $z$ on $\mathcal{P}$ as a point $f$ of $\mathcal{P}$ for
which $d(z, f)$ is a minimum. We shall see that if $z$ is not orthogonal to $\mathcal{P}$ then its foot is
unique. Moreover we shall deduce an analogue of the Pythagorean theorem, namely:

§(8.3) If $x \in \mathcal{P}$ then $p(z, x) = p(z, z^*)\,p(z^*, x)$ where $z^*$ is the foot of $z$ in $\mathcal{P}$.

To construct the foot $z^*$ in $\mathcal{P}$ of $z \notin \mathcal{P}$ and not orthogonal to $\mathcal{P}$ proceed as follows
(see Figure 2): Select an arbitrary pair of antipodes $a, a'$ of $\mathcal{P}$ and let $a^*$ be the antipode
of $a$ in $\mathcal{P}_{az}$. This will be distinct from $a'$ since $z \notin \mathcal{P}$. Let $a''$ be the antipode of $a'$ in
$\mathcal{Q} = \mathcal{P}_{a'a^*}$ and $a^{**}$ be the antipode of $a^*$ in $\mathcal{Q}$. Since $a$ is orthogonal to $a'$ and $a^*$ it is
by §(5.5) orthogonal to $\mathcal{Q}$, and hence $a, a', a''$ are mutually orthogonal. Also $a, a^*, a^{**}$ are
mutually orthogonal and so, since $p(a|z) + p(a^*|z) = 1$, it follows that $p(z|a^{**}) = 0$. But
$p(a^*|z) + p(a^{**}|z) = p(a'|z) + p(a''|z)$ by §(5.1) and so $p(a|z) + p(a'|z) + p(a''|z) = 1$.
Hence $z$ is in the subspace spanned by $a, a', a''$. One notes that since $z$ is by hypothesis
not orthogonal to $\mathcal{P}$ it will be different from $a''$, and hence the GPS $\mathcal{P}_{a''z}$ will intersect
$\mathcal{P}$ at a unique point. We take this to be the foot $z^*$ of $z$ on $\mathcal{P}$. Note that $\mathcal{P}$ and $\mathcal{P}_{a''z}$
are normal to one another as defined following §(5.5). Thus §(8.3) can be regarded as a
formula relating distances between two points on a pair of normal GPS's to their distances
to the intersection point, in this case at $z^*$.

To prove §(8.3) we need two lemmas:

Let $z_1, z_2$ lie respectively on a normal pair of GPS's $\mathcal{P}_1$ and $\mathcal{P}_2$ with intersection $z_0$.

§(8.4) $z_0$ is the nearest point of $\mathcal{P}_2$ to any point of $\mathcal{P}_1$.

To see this observe that if there were a point $x$ on $\mathcal{P}_2$ nearer to a point $y$ on $\mathcal{P}_1$, and
if $z_0'$ is the antipode of $z_0$ on $\mathcal{P}_1$, then the curve consisting of a circular arc connecting
$z_0'$ to $y$ together with the circular arc connecting $y$ to $x$ would be less than a semicircle.
But $z_0'$ and $x$ can be connected by a semi-circular arc which by §(8.1) is the shortest
distance between them. •

From (18,28) we have

$$p(a|z) - p(b'|z) = -\sin(\theta(a, b')/2)\,\big(p(c|z) - p(c'|z)\big). \tag{33}$$

Now let the intersection $z_0$ of the two normal spheres be the midpoint of the two points $a, b'$
on $\mathcal{P}_1$. Consider all pairs of points on the circular arc $\mathcal{C}$ joining $a, b'$ that are equidistant
from $z_0$. Replacing $a, b'$ by these will not change $c, c'$, which will always lie at opposite poles
one quarter circle from $z_0$ on $\mathcal{C}$. Now let $x$ be a variable point on the circular arc joining
$a, b'$. Then for any $y$ on $\mathcal{P}_1$ §(8.1) shows that $p(x|z)$ is a minimum when $x$ passes through
$z_0$. Hence the left side of the last equation must vanish faster than linearly in $\theta(a, b')$ as
this quantity tends to zero. But the first factor on the right of (33) vanishes linearly, so
that we must have $p(c|z) - p(c'|z) = 0$ and hence $p(a|z) = p(b'|z)$. In other words we have
proved the lemma:

§(8.5) If $\mathcal{P}_1, \mathcal{P}_2$ are normal, two points of $\mathcal{P}_1$ equidistant from the intersection are also
equidistant from any point of $\mathcal{P}_2$.

Now if $a, b$ in (12) are points of $\mathcal{P}_2$ equidistant from the intersection $z_0$, then in (12)
we see that $c = z_0$. Moreover $c'$ is its antipode in $\mathcal{P}_2$ and hence by §(5.6) it is orthogonal
to all points of $\mathcal{P}_1$. Using §(8.5) and (26) we obtain §(8.3). •
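
In the conventional complex model the product rule §(8.3) can be checked directly. The sketch below assumes that a GPS is realized as a two-dimensional subspace of $\mathbb{C}^3$, that $p(u, v) = |\langle u|v\rangle|^2$, and that the foot of $z$ is its normalized orthogonal projection onto that subspace; the helper names `random_unit` and `p` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_unit(n):
    """Random normalized complex vector (illustrative helper)."""
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

def p(u, v):
    """Transition probability |<u|v>|^2 of the conventional model."""
    return abs(np.vdot(u, v)) ** 2

# Model the GPS as the subspace spanned by the first two basis vectors of C^3.
z = random_unit(3)

# Foot of z: normalized orthogonal projection onto the subspace.
proj = z.copy()
proj[2] = 0
z_star = proj / np.linalg.norm(proj)

# An arbitrary point x lying in the subspace.
x = np.zeros(3, dtype=complex)
x[:2] = random_unit(2)

lhs = p(z, x)
rhs = p(z, z_star) * p(z_star, x)
print(lhs, rhs)
```

The two printed numbers agree to rounding error, as §(8.3) requires in this model.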

The geometric significance of this theorem emerges if in (27) we put $\phi(a, b) = \theta(a, b)/2$
so that $p(a, b) = \cos^2(\phi(a, b))$, whence our theorem gives:

$$\cos(\phi(z_1, z_2)) = \cos(\phi(z_1, z_0))\,\cos(\phi(z_0, z_2)). \tag{34}$$

But this may be recognized as the fundamental relation between the hypotenuse and legs of
a right spherical triangle [13]. Now consider Figure 3. The two points $z_1, z_2$ lie on the
same great circle through $a$. Their feet on $\mathcal{P}_{aa'}$ are $z_1^*$ and $z_2^*$. Because of (34) the distances
from $a$ to $z_1^*$ and from $a$ to $z_2^*$ are such that we can deduce that $z_1^*$ and $z_2^*$ must also lie
on a single great circle through $a$. Thus we have the general rule that if we have a great
circle $\mathcal{C}$ through $a$ lying on one GPS, then the feet of the points of $\mathcal{C}$ on another GPS also
form a great circle through $a$. We can formulate this result as the following very important
corollary:

§(8.6) There is an equivalence relation "~" between great circles on GPS's through a
given point $x$, namely: $\mathcal{C} \sim \mathcal{C}^*$ if the points of $\mathcal{C}$ are the feet on one GPS $\mathcal{P}$ of the points
of $\mathcal{C}^*$ on another GPS $\mathcal{P}^*$. In each class there is one and only one circle from each GPS
through $x$.

We call the set consisting of all points other than the intersection $x$ that lie on one
of the equivalent circles within a given class a real cross-section of $\mathcal{S}$. This terminology is
based on the fact that in the case of a real Hilbert space the GPS's are themselves circles,
whence the real cross-section is the whole of $\mathcal{S}$ (other than $x$). Note that in general we can
partition $\mathcal{S}$ into the disjoint union of real cross-sections plus $x$.

We shall also need some other corollaries: 

§(8.7) The distance of a point on one GPS to its foot on an adjacent GPS is the same 
for all points at a given distance from the intersection. Moreover the distance of the foot 
to the intersection is the same for all such points. 

Proof: Let $\mathcal{P}, \mathcal{Q}$ be two GPS's intersecting at $a$, and let $q \neq a$ be an element of $\mathcal{Q}$. Let
$m, n$ be the antipodes of $a$ on $\mathcal{P}, \mathcal{Q}$ respectively. If $f$ is the foot on $\mathcal{P}$ of $q$ then we will
show that

$$d(q, f)/d(q, a) = d(n, m), \tag{35}$$

and the first part of §(8.7) follows from the fact that the right side is the same for all
points $q \in \mathcal{Q}$: Let $\mathcal{R} = \mathcal{P}_{mn}$, and $g$ the antipode of $m$ on $\mathcal{R}$. Then $n$ is the foot of
$q$ on $\mathcal{R}$ and $p(q|n) = 1 - p(q|a) = d^2(q, a)$. Also $p(g|q) = 1 - p(q|f) = d^2(q, f)$ and
$p(g|n) = 1 - p(n|m) = d^2(n, m)$. Then (35) follows from $p(g|q) = p(g|n)\,p(n|q)$ given by
§(8.3). The second part then follows from §(8.3), noting that $p(f|a) = p(a, q)/p(q, f)$. •
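
Relation (35) can be illustrated in the conventional model. The sketch below assumes $\mathbb{C}^3$ with $d = \sqrt{1 - p}$ (cf. (25)) and two GPS's modelled as the subspaces spanned by $\{a, m\}$ and $\{a, n\}$; the specific vectors, angles, and the helper `foot_on_P` are illustrative choices, not taken from the text.

```python
import numpy as np

def p(u, v):
    """Transition probability |<u|v>|^2."""
    return abs(np.vdot(u, v)) ** 2

def d(u, v):
    """d-metric of the conventional model, d = sqrt(1 - p)."""
    return np.sqrt(max(0.0, 1.0 - p(u, v)))

a = np.array([1, 0, 0], dtype=complex)                       # intersection point
m = np.array([0, 1, 0], dtype=complex)                       # antipode of a on P
n = np.array([0, np.cos(0.7), np.sin(0.7) * np.exp(0.3j)])   # antipode of a on Q

def foot_on_P(q):
    """Normalized projection of q onto span{a, m}."""
    proj = np.array([q[0], q[1], 0])
    return proj / np.linalg.norm(proj)

# Ratio d(q, f) / d(q, a) for several points q on Q at different distances from a.
ratios = []
for t in (0.2, 0.9, 1.4):
    q = np.cos(t) * a + np.sin(t) * np.exp(1.1j) * n
    f = foot_on_P(q)
    ratios.append(d(q, f) / d(q, a))

print(ratios, d(n, m))
```

Every ratio coincides with $d(n, m)$ regardless of where $q$ sits on $\mathcal{Q}$, which is the content of (35).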

With the same notation as in §(8.7), let $u$ be any other point of $\mathcal{Q}$, $v$ its foot on $\mathcal{P}$,
and $w$ any other point of $\mathcal{P}$. Then

§(8.8) $d(u, v) \leq d(q, w)/d(q, a)$.

Proof: $d(u, v) = d(u, a)\,d(n, m) \leq d(n, m) = d(q, f)/d(q, a) \leq d(q, w)/d(q, a)$. •

Thus $d(u, v)$ can be made as small as we like by making $q$ close enough to some point $w$.

Armed with §(8.3) we can now prove a very important lemma. In the following we
assume $\mathcal{S}$ with $N = 3$:

§(8.9) Reflection of a GPS in a pair of antipodes extends to an isometry of $\mathcal{S}$.

Proof: Let $\mathcal{P}$ be a GPS in $\mathcal{S}$, let $a, a'$ be a pair of antipodes of $\mathcal{P}$, let $a''$ be the unique
element such that $a, a', a''$ form a basis of $\mathcal{S}$, and let $\mathcal{Q} = \mathcal{P}_{a'a''}$. Then each $z$ in $\mathcal{Q}$ is
orthogonal to $a$, and we define the reflection $R$ of $\mathcal{S}$ to be the reflection of the GPS $\mathcal{P}_{az}$ in
$a, z$ for all $z \in \mathcal{Q}$. We contend that this is an isometry of $\mathcal{S}$. By §(8.3) it suffices to show
that if $x$ is the foot in $\mathcal{P}_{az}$ of $x^* \in \mathcal{P}_{az^*}$, then the image $Rx$ is the foot in $\mathcal{P}_{az}$ of the image
$Rx^*$. To see this note that $a, x, z$ define a great circle $\mathcal{C}$ on $\mathcal{P}_{az}$ through $a$, and $a, x^*, z^*$
define a great circle $\mathcal{C}^*$ on $\mathcal{P}_{az^*}$ through $a$. By §(8.7), since $Rx^*$ is the same distance from $a$
as $x^*$, the distance of $Rx^*$ to its foot $f$ on $\mathcal{P}_{az}$ is the same as that of $x^*$ from $x$, and also the
distance of $f$ from $a$ is the same as the distance of $x$ from $a$. But by §(8.6) the set of feet
of $\mathcal{C}^*$ lie on $\mathcal{C}$, and the only point of $\mathcal{C}$ besides $x$ at that distance from $a$ is the image $Rx$ of
$x$, i.e. we must have $f = Rx$. •

§(8.10) $\mathcal{S}$ is one-point homogeneous, i.e. given any two points $x, y$ there is an isometry
that maps $x$ into $y$.

Proof: We may assume $x \neq y$. Then $x, y$ define a GPS with antipodes $c, c'$ such that
$y$ is the reflection of $x$ in $c, c'$. Hence the assertion follows from §(8.9).

§(8.11) Any rotation of a GPS about a pair of antipodes can be extended to an isometry
of $\mathcal{S}$.

Proof: Any rotation of a GPS about a pair of antipodes can be produced by a pair of
successive reflections. Hence the assertion follows from §(8.9).

We are now ready to prove pairwise homogeneity for $N = 3$.

Using §(8.10) we may first map $a$ to $x$ by an isometry. It then suffices to show that
if $d(a, b) = d(a, y)$ then there is an isometry holding $a$ fixed that maps $b$ to $y$. Now using
§(8.11) there is an isometry that rotates $\mathcal{P}_{ay}$ so that $y$ is mapped onto the great circle
through $a$ containing the foot $b^*$ of $b$ on $\mathcal{P}_{ay}$. Thus it suffices to show that for two adjacent
GPS's there is an isometry that maps a great circle through the intersection of one into an
equivalent great circle of the other in the sense of §(8.6). But this is accomplished simply
as follows: Let $a'$ be any fixed element orthogonal to $a$. For each great circle $\mathcal{C}$ through
$a$ and $a'$ consider the set of great circles through $a$ equivalent to $\mathcal{C}$ in the sense of §(8.6).
Since these circles can be mapped isometrically to a 2-sphere as described in the proof of
§(8.6), it is seen that if $\mathcal{C}_1$ is any one of these it will have a unique reflection $\mathcal{C}^*$ in $\mathcal{C}$. If
we perform this reflection simultaneously for all $\mathcal{C}$'s through $a, a'$, then since the families
are disjoint we will have produced a well-defined reflection that is also, in view of §(8.9),
an isometry. By choosing $a'$ judiciously as the midpoint of two points $z, z^*$ antipodal to
$a$ on equivalent circles, we will thus have produced the required isometry. Combining this
with §(6.1) we have

§(8.12) Spaces $\mathcal{S}$ with $N = 3$ are pairwise homogeneous, M-convex metric spaces.

This is a very important conclusion because of the existence of a theorem due to Wang
[14] informing us that a pairwise homogeneous, M-convex metric space for $N = 3$ must
be one of the spaces $\mathcal{R}$, $\mathcal{C}$, $\mathcal{Q}$, Cay, which are the projective planes over the real, complex,
quaternion, or octonion (Cayley) numbers respectively. The corresponding rank of $\mathcal{S}$ will
be 1, 2, 4, and 8 respectively. (The reader should not be confused by the use of the word
"plane". In projective geometry planes have three coordinates.) Note that the result we
have obtained is much stronger than the result of quantum logic, which only informs us that
we have some sort of projective plane. Those that we have obtained are a very restricted
class that can be coordinatized by numbers which are almost fields. The quaternions and
octonions lack commutativity, and the octonions only obey a restricted form of associativity
(alternative associativity). From the point of view of projective geometry all of these
planes enjoy a restricted form of "transitivity", i.e. there is a large but not exhaustive set
of collineations and correspondingly a large but not exhaustive set of configurations for
which Desargues' theorem holds. In the jargon of projective geometry they are said to be
"Moufang planes".

IX. DIMENSIONS N > 3. 

We could approach the problem of N > 3 by imitating what we did for N = 3, i.e. 
we could demonstrate that the space is pairwise homogeneous and then invoke Wang's 
theorem. However, as in the case of N = 2 there is a somewhat more direct approach 
that takes advantage of the fact that for N > 3 the representation theorem of projective 
geometry is very strong. When combined with the metric properties we have already 
derived it will suffice to give us all that we could get using Wang's theorem. Moreover 
this approach enables us to show how coordinates are actually introduced and to relate 
the topology of the coordinates to the d-metric. 

First let us establish that $\mathcal{S}$ is a projective space. To see this define the term "line
joining $a, b$" to mean $\mathcal{P}_{ab}$. If $a, b$ are distinct and $z \notin \mathcal{P}_{ab}$ we construct a "plane" containing
the three points as we did in §(8.3) where we constructed the foot of $z$ on $\mathcal{P}_{ab}$. Now
referring to §(5.4) one sees that the subspace spanned by $a, b, z$ will obey all of the axioms
for a projective plane provided that there is at least one more point not on any
of the lines $\mathcal{P}_{az}, \mathcal{P}_{zb}, \mathcal{P}_{ab}$. This will be guaranteed by the connectedness property. This
construction applies to all dimensions and we conclude that $\mathcal{S}$ is a projective geometry.

Now it is known that for $N > 3$ Desargues' theorem holds throughout the
space and in consequence the space can be coordinatized by a skew-field, i.e. all of the
properties of a field hold except that multiplication need not be commutative. Now it is
also known from a theorem of Pontriagin [15] that the only topological skew-fields are
$\mathcal{R}$, $\mathcal{C}$ and $\mathcal{Q}$. We therefore only have to show that the coordinates respect the topology
of the d-metric and thereby deduce the same result that would be obtained from Wang's
theorem for $N > 3$, i.e. that we must have one of these three projective spaces. The
coordinatization procedure to be described next is standard.

Since every two dimensional subspace is a sphere of the same rank $m$, we can
individually coordinatize each such subspace with $m$ real variables, e.g. polar latitude and
longitude variables that are continuous in the d-metric. Our task is to extend this to the
whole three dimensional space. We proceed in the following manner: For convenience
write $xy = \mathcal{P}_{xy}$ and call it the "line joining $x$ and $y$". Special reference points will be
indicated by capital rather than lower case letters. Let $O, X, Y$ be a basis of $\mathcal{S}$ with $O$
called the "origin" and $Y$ called the "point at infinity". $OX$ is called the X-axis, $OY$ is
called the Y-axis, and $XY$ the line at infinity. Select some arbitrary point $I$ not on any of
the lines $OX, OY, XY$. For any point $P$ not on $XY$ let $YP$ intersect $OI$ at $x$ and $XP$ intersect
$OI$ at $y$. Then we coordinatize $P$ by the pair $(x, y)$. The line $OI$ then has the equation
$y = x$, the points of $OX$ have $y = O$ and of $OY$ have $x = O$. (Note that this is the point
$O$, not the number zero.) Note also that $O \to (O, O)$ and $I \to (I, I)$. Now every point $q$
on $XY$ other than $Y$ is the intersection of a line from $O$ through some point of the form
$(I, m)$. We write $q \to (m)$. To $Y$ we assign the arbitrary symbol $(\infty)$.

Now let $(O, b)$ be the intersection on $OY$ of the line through $(m)$ and $(x, y)$. If this
were a Cartesian plane with the letters indicating real numbers we would have $y = mx + b$.
In general, since $y$ is determined by $x, m, b$ we write:

$$y = T(m, x, b). \tag{36}$$

The function $T$ with three arguments is called a ternary operation. We may now
introduce the convenient definitions:

$$m \cdot x = T(m, x, O), \qquad x + b = T(I, x, b). \tag{37}$$

We must next establish that with these definitions the operations $+$ and $\cdot$ have the group
and distributivity properties required to define a field of numbers. It is a remarkable fact,
however, that the more special cases of Desargues' theorem that one can identify
as holding in the plane, the closer one can come to a field. In particular, if it is known
that the plane can be embedded in a higher dimensional space then one can show (as
Desargues himself did!) that Desargues' theorem holds unrestrictedly. In particular
one can demonstrate that

$$T(m, x, y) = T(I, T(m, x, O), y), \tag{38}$$

which means that

$$T(m, x, y) = m \cdot x + y. \tag{39}$$

Projective planes with this property are called "linear". Moreover one can show that
addition and multiplication have the group and distributivity properties of a skew-field,
the term "skew" meaning that multiplication need not be commutative. We now show
that this is a topological field in that $+$ and $\cdot$ are continuous operations in the sense of the
d-metric.
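
As a concrete illustration of (36)-(39), the following sketch takes the Cartesian plane over the rationals as a hypothetical model, with the reference points $O$ and $I$ playing the roles of the numbers 0 and 1. The function names `T`, `times`, and `plus` are illustrative; the check only confirms that this particular model is linear in the sense of (38), not that a general plane is.

```python
from fractions import Fraction

# Hypothetical Cartesian model: the reference points O and I act as 0 and 1.
O, I = Fraction(0), Fraction(1)

def T(m, x, b):
    """Ternary operation of the Cartesian plane: y = T(m, x, b) = m*x + b, cf. (36)."""
    return m * x + b

def times(m, x):
    """Product defined through T as in (37)."""
    return T(m, x, O)

def plus(x, b):
    """Sum defined through T as in (37)."""
    return T(I, x, b)

samples = [Fraction(-3, 2), Fraction(0), Fraction(2, 5), Fraction(7)]

# Linearity (38)-(39): T(m, x, y) = T(I, T(m, x, O), y) = m.x + y for all samples.
linear = all(
    T(m, x, y) == T(I, T(m, x, O), y) == plus(times(m, x), y)
    for m in samples for x in samples for y in samples
)
print(linear)
```

Exact rational arithmetic is used so that the equalities are tested without rounding error.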

§(9.1) Addition and multiplication are continuous in the d-metric. 

Proof: Consider a line $L$ joining $(I)$ and $(x, y)$. Let it intersect $OY$ at $U = (O, b)$. By
§(8.8) the line $L$ comes arbitrarily close to the line $y = x$ joining $(I)$ and $O$ if $U$ is made
close enough to $O$. But $y = x + b$ by definition, and so we have proved that $(x, y)$ is as close
as we wish to $(x, x)$ if $b$ is sufficiently close to $O$ in the topology of $OY$. This proves the
continuity of "+". The argument for "·" is similar and will be omitted. •



We may now invoke the theorem of Pontriagin [15] that informs us that $\mathcal{R}$, $\mathcal{C}$ and
$\mathcal{Q}$ are the only topological skew-fields. We thus establish that for $N > 3$ we must have
one of these three projective spaces. Note that since all of the listed spaces are pairwise
homogeneous it follows that

§(9.2) For all $N$ the space $\mathcal{S}$ is a pairwise homogeneous, M-convex, metric space.

In the quantum logic approach one would at this stage call upon Gleason's theorem
to deduce (1) using the fact that $p$ is a frame-function (see (13)). Gleason's theorem
has two parts, the first of which is deep and difficult, and the second of which is quite
simple. The first part tells us that for $N > 2$ the frame functions on any of the projective
spaces over $\mathcal{R}$, $\mathcal{C}$, $\mathcal{Q}$, Cay are continuous in the natural topology of the coordinates. Once
this is established the easy second part deduces that the trace function in (1) is the only
permissible form for $p$. Now we see that, having already established continuity, we do not
need to invoke the difficult first part of Gleason's theorem but need only the second part to
deduce (1). The same remark holds for $N = 3$. Finally, in the case $N = 2$ where Gleason's
theorem gives us nothing at all we do not need it, because we have already established that
(27) gives $p$ in terms of the arc length between points on the GPS, and in the particular
cases where the rank $m$ is 1, 2, 4, 8 the spheres coincide with projective lines over $\mathcal{R}$, $\mathcal{C}$, $\mathcal{Q}$
and Cay, so that (27) is expressed by (1).

We have now completed the proof of parts A,B,C,D,E of the main theorem §(4.1). It 
is remarkable that the possible models allowed are precisely the set of Jordan algebras [10]!
The Jordan algebra axioms codify the algebraic manipulations that one performs in conven- 
tional quantum mechanics and were introduced with the idea of discovering generalizations 
of quantum mechanics in the early days when it was not clear that the conventional model 
could accommodate relativity. Of course the Jordan axioms do not by themselves suffice 
to deduce (1) and, moreover, they are abstract axioms that are not tied in any simple way 
to experiment. However, the fact that we have found precisely this list of possibilities is of 
considerable interest because the Jordan algebra axioms generalize to the von Neumann 
algebras and this suggests possible generalizations of our main theorem. 

X. ISOLATING THE CONVENTIONAL MODEL. 

Since the real case is included in the complex case, the problem now arises of discerning 
a physical principle or experiment that isolates the conventional model by eliminating what 
we shall call the exotic cases, i.e. m > 2 for N = 2, Q for N > 2, and Cay for N = 3. 
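
The exotic cases rest on the quaternions and octonions, whose algebraic defects (loss of commutativity, then of associativity) were noted in Section VIII. The sketch below builds these algebras by the standard Cayley-Dickson doubling, which is an assumption of this illustration and not a construction used in the text; `conj`, `mul`, and `report` are hypothetical helper names.

```python
import numpy as np

def conj(x):
    """Cayley-Dickson conjugate of an element given as a real coefficient array."""
    if len(x) == 1:
        return x.copy()
    h = len(x) // 2
    return np.concatenate([conj(x[:h]), -x[h:]])

def mul(x, y):
    """Cayley-Dickson product: (a, b)(c, d) = (ac - d* b, da + b c*)."""
    if len(x) == 1:
        return x * y
    h = len(x) // 2
    a, b = x[:h], x[h:]
    c, d = y[:h], y[h:]
    return np.concatenate([
        mul(a, c) - mul(conj(d), b),
        mul(d, a) + mul(b, conj(c)),
    ])

rng = np.random.default_rng(1)

def report(dim):
    """Test commutativity, associativity, and alternativity on random elements."""
    x, y, z = (rng.normal(size=dim) for _ in range(3))
    return {
        "commutative": np.allclose(mul(x, y), mul(y, x)),
        "associative": np.allclose(mul(mul(x, y), z), mul(x, mul(y, z))),
        "alternative": np.allclose(mul(mul(x, x), y), mul(x, mul(x, y)))
                       and np.allclose(mul(mul(y, x), x), mul(y, mul(x, x))),
    }

for name, dim in [("real", 1), ("complex", 2), ("quaternion", 4), ("octonion", 8)]:
    print(name, report(dim))
```

Random elements are used only as a generic probe; a failure of an identity on one sample proves the failure, while success on a sample is merely consistent with the identity.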

Let us begin by looking at N = 2. Suppose that on a sphere of rank m we select 
a basis, i.e. a pair of antipodes. For m = 1 the equator opposite is a pair of points and 
for m = 2 it is a circle. The group of isometries leaving the basis fixed in these two non- 
exotic cases is a commutative group, in the first instance the two element group and in 
the second the group of rotations about a fixed axis. But in the exotic cases m > 2 the 
equator is a sphere of rank at least two and so the group of isometries fixing the basis 
is non-commutative. This same observation can be applied for N > 2. For there are 
isometries affecting only two dimensional subspaces (i.e. acting as the identity on the rest 
of the space) and in the exotic cases there will be non-commuting isometries that leave 
a basis of the two dimensional subspace invariant. On the other hand in the non-exotic 
cases the isometries that fix a basis are obtained by exponentiating hermitian operators for 
which this basis is an eigenbasis. Since simultaneously diagonalizable hermitian operators
commute, the isometries will commute. We thus have the last part of the main theorem:

§(4.1 F) A necessary and sufficient condition for the exclusion of exotic models is
that isometries fixing all the elements of the same basis commute.
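
The contrast behind §(4.1 F) can be sketched numerically. In the complex case the basis-fixing isometries are taken to be diagonal unitaries; for the quaternionic case the sketch assumes, as an illustration only, that the relevant non-commuting transformations behave like unit quaternions, here written in a common $2 \times 2$ complex matrix representation. The names `diagonal_unitary`, `qi`, and `qj` are hypothetical.

```python
import numpy as np

def diagonal_unitary(alpha, beta):
    """Unitary that fixes both rays of the standard basis of C^2."""
    return np.diag([np.exp(1j * alpha), np.exp(1j * beta)])

# Complex case: two basis-fixing isometries.
U1 = diagonal_unitary(0.3, -1.2)
U2 = diagonal_unitary(2.1, 0.4)
complex_commutator = U1 @ U2 - U2 @ U1

# Quaternionic units i and j in a 2x2 complex matrix representation (assumption).
qi = np.array([[1j, 0], [0, -1j]])
qj = np.array([[0, 1], [-1, 0]], dtype=complex)
quaternion_commutator = qi @ qj - qj @ qi

print(np.allclose(complex_commutator, 0), np.allclose(quaternion_commutator, 0))
```

The first commutator vanishes while the second does not, mirroring the commutative and non-commutative groups described above.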

The question that we must now confront is the physical significance of the property
contained in §(4.1 F). In conventional dynamics time evolution $U(t)$ is associated with
a certain basis, the eigenbasis of the Hamiltonian $H$, which is invariant under $U$. Any
operator that leaves this basis invariant is associated with an integral of the motion and
commutes with $H$. Now we see that in the exotic cases we may have operators that leave
that basis invariant but which fail to commute with $U$ (or with $H$). It is particularly
interesting to consider the quaternionic case $m = 4$, which is the only one of the exotic cases
that can occur for all values of $N$. If we select a basis which is to be left invariant under
time evolution, then there will be a three-parameter, non-commutative continuous group of
transformations that leave this basis invariant. Thus we could replace the notion of "time"
with a three-parameter quantity $t = (t_1, t_2, t_3)$, and expressions like $e^{-iEt}$ appearing in
the evolution of wave functions would be replaced by $e^{-iE_1t_1 - jE_2t_2 - kE_3t_3}$ with quaternionic
units $i, j, k$. Thus in a world with this kind of quantum mechanics both space and
time would be three dimensional and energy, like momentum, would be a three-component
quantity.

As we remarked earlier, any system described by a GPS with rank $m > 2$ will have
additional elements in the equatorial decomposition, and such a system, if it existed, could
be identified experimentally in this direct way. For $N > 2$, the exotic cases can also be
detected experimentally by means of the following construction, which we illustrate for
$N = 3$. Consider Figure 6:

The points $a_1, a_2, a_3$ lie on a GPS "A" and the points $b_1, b_2, b_3$ lie on another GPS
"B", both of which are in the same $N = 3$ space. The GPS $\mathcal{P}_{a_1b_2}$ is indicated by a line, as
are the other five GPS's formed by pairings with distinct subscripts. The intersections of
the indicated GPS's are indicated by $x, y, z$. It is a consequence of the theorem of Pappus
in projective geometry over a field that the three dark circles will be collinear. Thus in the
conventional model where we have the field of complex (or real) numbers, these three will
lie on a single GPS. For the exotic cases they will not in general lie on the same GPS.
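
The Pappus configuration can be checked exactly over a commutative field. The following is a minimal sketch in the real projective plane with integer homogeneous coordinates, where the cross product both joins two points and intersects two lines; the coordinates chosen and the helper names `cross`, `dot`, and `meet` are illustrative assumptions.

```python
def cross(u, v):
    """Cross product of integer 3-vectors; joins two points or meets two lines."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )

def dot(u, v):
    """Ordinary dot product of two 3-vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def meet(p1, q1, p2, q2):
    """Intersection of line p1q1 with line p2q2 in homogeneous coordinates."""
    return cross(cross(p1, q1), cross(p2, q2))

# Three points on the line y = 0 and three points on the line y = 1.
a1, a2, a3 = (0, 0, 1), (1, 0, 1), (3, 0, 1)
b1, b2, b3 = (0, 1, 1), (2, 1, 1), (5, 1, 1)

# Intersections of the pairings with distinct subscripts.
x = meet(a1, b2, a2, b1)
y = meet(a1, b3, a3, b1)
z = meet(a2, b3, a3, b2)

# The three points are collinear exactly when this determinant vanishes.
pappus_det = dot(x, cross(y, z))
print(x, y, z, pappus_det)
```

Over a non-commutative coordinate system this determinant test has no direct analogue, which is consistent with the statement that the exotic cases need not satisfy the collinearity.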

XI. CONCLUSIONS. 

Our main theorem §(4.1) shows that the elementary indistinguishability properties of
mixtures are sufficient to imply that the only possible model of quantum mechanics is given
either by (1) or a few exotic relatives. The results obtained in this way are stronger than
those obtained by the quantum logic of Birkhoff and von Neumann in that restrictions are
obtained for $N = 3$ and even for $N = 2$, where quantum logic and Gleason's theorem give
no constraint at all. We have also described some direct experimental tests for the low
dimensional exotic cases and indicated the peculiar dynamical consequences of the exotic
variants in arbitrary dimensions.

Appendix 



For convenient reference we here reproduce the derivation [16] of the relationship
between $p$ and the d-metric in the conventional model and locally realistic (hidden
variable) theories. From (1):

$$\sup_z \left|\mathrm{Tr}(\pi(x)\pi(z)) - \mathrm{Tr}(\pi(y)\pi(z))\right| = \sup_z \left|\langle z|\pi(x) - \pi(y)|z\rangle\right|. \tag{40}$$

But this is just the largest eigenvalue of $\pi(x) - \pi(y)$. Since the $\pi$'s are projectors:

$$(\pi(x) - \pi(y))^3 = (1 - |\langle x|y\rangle|^2)\,(\pi(x) - \pi(y)) \tag{41}$$

and one reads off the largest eigenvalue to obtain (25).
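
The identity (41) and the resulting largest eigenvalue can be verified numerically. The sketch below assumes random unit vectors in $\mathbb{C}^3$ and the conventional projectors $\pi(v) = |v\rangle\langle v|$; the helper names `random_unit` and `projector` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def random_unit(n):
    """Random normalized complex vector (illustrative helper)."""
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

def projector(v):
    """Rank-one projector |v><v|."""
    return np.outer(v, v.conj())

x, y = random_unit(3), random_unit(3)
overlap = abs(np.vdot(x, y)) ** 2          # |<x|y>|^2
A = projector(x) - projector(y)

# Cubic identity (41): A^3 = (1 - |<x|y>|^2) A.
cubic_ok = np.allclose(A @ A @ A, (1 - overlap) * A)

# Largest eigenvalue of the Hermitian matrix A.
largest = np.linalg.eigvalsh(A).max()
print(cubic_ok, largest, np.sqrt(1 - overlap))
```

The largest eigenvalue equals $\sqrt{1 - |\langle x|y\rangle|^2}$, which is the square-root form of the d-metric referred to as (25).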

If a locally realistic theory is such that there is agreement between its predictions for
various methods of state preparation, one has a set $\Lambda$ with a measure $\mu$ such that

$$p(x|y) = \mu(\Lambda(x) \cap \Lambda(y)), \qquad \mu(\Lambda(x)) = 1, \ \forall x. \tag{42}$$

To evaluate (5) we must compute the supremum over $z$ of $|\mu(\Lambda(x) \cap \Lambda(z)) - \mu(\Lambda(y) \cap \Lambda(z))|$.
But we note that the contribution coming from any overlap of $\Lambda(x)$ and $\Lambda(y)$ will cancel.
Hence one can compute the $z$ maximizing the expression as if the sets are disjoint. This
occurs when either $z = x$ or $z = y$ and gives $1 - \mu(\Lambda(x) \cap \Lambda(y))$, whence

$$d(x, y) = 1 - p(x|y). \tag{43}$$

Comparing with (25) one sees the incompatibility between quantum mechanics and locally
realistic theories due to the absence of the square root except in the classical situation
where $p$ assumes only the values 0, 1.